Smoking, mortality, access to diagnosis, and treatment of lung cancer in Brazil

ABSTRACT INTRODUCTION Lung cancer (LC) is a relevant public health problem in Brazil and worldwide, given its high incidence and mortality. Thus, the objective of this study is to analyze the distribution of smoking and smoking status according to sociodemographic characteristics and disparities in access, treatment, and mortality due to LC in Brazil in 2013 and 2019. METHOD Retrospective study based on the triangulation of national data sources: a) analysis of the distribution of smoking, based on the National Survey of Health (PNS); b) investigation of LC records via Hospital-based Cancer Registry (HCR); and c) distribution of mortality due to LC in the Mortality Information System (SIM). RESULTS There was a decrease in the percentage of people who had never smoked from 2013 (68.5%) to 2019 (60.2%) and in smoking history (pack-years). This was observed to be greater in men, people of older age groups, and those with less education. Concerning patients registered in the HCR, entry into the healthcare service occurs at the age of 50, and only 19% have never smoked. While smokers in the population are mainly Mixed-race, patients in the HCR are primarily White. As for the initial stage (I and II), it is more common in White people and people who have never smoked. The mortality rate varied from 1.00 for people with higher education to 3.36 for people without education. Furthermore, White people have a mortality rate three times higher than that of Black and mixed-race people. CONCLUSION This article highlighted relevant sociodemographic disparities in access to LC diagnosis, treatment, and mortality. Therefore, the recommendation is to strengthen the Population-Based Cancer Registry, to develop and implement a nationwide LC screening strategy in Brazil, since combined prevention and early diagnosis strategies work better in controlling mortality from the disease, and to continue investing in tobacco prevention and control policies.


INTRODUCTION
Lung cancer (LC) has been the most common malignancy worldwide since 1985, both in incidence and mortality 1. In 2020, it was estimated that around 12% of new cancer cases were attributed to LC, which was responsible for 18.4% of cancer deaths worldwide 1. Furthermore, there is high lethality and low survival after diagnosis, especially at an advanced stage 2.
In Brazil, estimates for the period 2023-2025 indicate that LC will be the second most common type of cancer, not counting specific breast and prostate malignancies 3. In 2020, it was also the leading cause of cancer-related mortality 4.
The main risk factor for the development of LC is smoking 5, often associated with age, due to the length of exposure to tobacco 4. Other related factors are occupational risk 6, environmental exposure 7, and sociodemographic characteristics, such as sex 8, race 9, and education 10. Notably, in Brazil, there is a gap in the literature regarding factors other than smoking.
Understanding how these factors affect the risk of developing LC, their association with tobacco consumption in Brazil, and the possible impact on mortality is essential for adequately formulating public policies. A promising way to reduce LC disparities is to improve prevention and detection of the disease at an early stage 11, mainly through screening programs for people in at-risk groups 12. The criteria for establishing an adequate screening program depend on understanding the degree of vulnerability of a population and the disparities in their access to diagnosis and treatment 12.
Age and smoking status are generally the main criteria for defining high-risk groups in LC screening guidelines. However, these parameters must be adjusted considering the demographic composition, specificities, and other relevant markers such as socioeconomic situation 12.
For instance, the Chinese guidelines were adapted concerning age and smoking status (currently 50 to 74 years old, with 30 pack-years) and consider the effects of air pollution due to environmental exposure and genetic factors 13. In the USA, the United States Preventive Services Task Force (USPSTF) guidelines were updated between 2013 and 2021 to mitigate access problems for Black people. The age range was expanded to 50 to 80 years (previously 55 to 80), and the smoking burden was reduced from 30 to 20 pack-years 14.
A valuable method for understanding the population at risk is triangulation through secondary databases, which, in addition to being easily accessible and having national coverage, allow for sociodemographic characterization and the assessment of health aspects and lifestyle habits. The Mortality Information System (SIM) is systematically used in Brazil to study LC 4,15,16. However, two significant challenges remain: estimating the risk of developing LC and measuring its disparities in diagnosis and treatment.
The most critical item for risk estimation is calculating the smoking burden, which is possible using data from the National Survey of Health (PNS) 17. Although the Vigilância de Fatores de Risco e Proteção para Doenças Crônicas por Inquérito Telefônico (VIGITEL - Surveillance of Risk and Protective Factors for Chronic Diseases by Telephone Survey) provides specific information about chronic diseases and their risk factors, it is only carried out in capital cities, making it difficult to extrapolate results to the national level 18. On the other hand, disparities in diagnosis and treatment can be mapped using the Hospital-Based Cancer Registry (HCR), which enables analysis according to sociodemographic, epidemiological, and clinical characteristics 19.
This article aims to analyze the population distribution of smoking, smoking burden, LC cases and mortality, and disparities in its access and treatment, according to sociodemographic characteristics in Brazil from 2013 to 2019, to support adequate screening of LC in the country.

METHODS
This retrospective study triangulated the PNS, HCR, and SIM as national secondary data sources. The HCR is a national database for systematic and continuous information collection from patients treated in hospital units with a confirmed cancer diagnosis. Sending data is mandatory for hospitals qualified in Specialized Oncology Care of the Brazilian Unified Health System (SUS) and optional for those not qualified 20.
Base triangulation followed these steps:
a. population analysis of the distribution of smoking (smokers, ex-smokers, smoking history, and never smoked) - PNS (2013 and 2019).
b. investigation of diagnosed LC cases according to smoking status and staging (initial and advanced) - HCR 20 (2013 and 2019).
Regarding the definition/elaboration of the variables arising from the PNS, the following question was used to identify smokers: "Do you currently smoke any tobacco products?" Those who answered "Yes, daily" or "Yes, less than daily" were considered smokers. Former smokers were identified based on the question: "And in the past, did you smoke any tobacco products?" considering the answers "Yes, daily" or "Yes, less than daily." Non-smokers were those who answered "No, I have never smoked" to the previous question.
https://doi.org/10.11606/s1518-8787.2024058005704
The smoking burden was estimated for people who smoked daily and who consumed "only industrialized cigarettes," "only straw or hand-rolled cigarettes," or both. Sporadic smokers and ex-smokers did not answer questions about the age at which they started smoking and the amount consumed. Approximately 10% of ex-smokers (n = 2,102) reported having quit smoking before establishing a habit and were disregarded in the calculation of smoking burden.
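The classification of smoking status from the two PNS questions quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `smoking_status` and its parameters are hypothetical, and any answer other than the quoted ones (including a missing answer) is assumed to leave the respondent unclassified.

```python
# Answers that count as smoking, as quoted in the text.
YES_ANSWERS = {"Yes, daily", "Yes, less than daily"}
NEVER_ANSWER = "No, I have never smoked"


def smoking_status(current_answer, past_answer=None):
    """Classify a respondent from the two PNS questions (hypothetical helper).

    current_answer: reply to "Do you currently smoke any tobacco products?"
    past_answer:    reply to "And in the past, did you smoke any tobacco products?"
    """
    if current_answer in YES_ANSWERS:
        return "smoker"
    if past_answer in YES_ANSWERS:
        return "ex-smoker"
    if past_answer == NEVER_ANSWER:
        return "never smoked"
    # Assumption: anything else (missing or unrecognized) stays unclassified.
    return None
```

For example, a respondent who does not smoke now but answers "Yes, less than daily" to the second question is classified as an ex-smoker.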
The smoking burden was estimated in pack-years. This number is a synthetic measure that combines the duration and intensity of smoking: the smoking burden of a pack-year corresponds to the daily smoking of 20 cigarettes for a year 22. The conversion was carried out to equate industrialized cigarettes with straw or hand-rolled cigarettes, whose consumption was multiplied by three. Despite the lack of robust literature on the equivalence between them, experts point out that one straw cigarette is equivalent to three industrial cigarettes 23,24.
The result was classified into the categories: a) up to 19 pack-years, b) from 20 to 29 pack-years, and c) 30 pack-years or more.
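The pack-year calculation and its categories can be illustrated with a short sketch. It assumes that the number of years smoked has already been derived (for instance, from the reported age at which smoking started) and that non-integer values below 20 fall into the first category; the function and constant names are hypothetical.

```python
CIGARETTES_PER_PACK = 20   # one pack-year = 20 cigarettes a day for one year
STRAW_TO_INDUSTRIAL = 3    # one straw or hand-rolled cigarette counted as three


def pack_years(industrial_per_day, straw_per_day, years_smoked):
    """Smoking burden in pack-years, with straw cigarettes converted."""
    equivalent_per_day = industrial_per_day + STRAW_TO_INDUSTRIAL * straw_per_day
    return equivalent_per_day / CIGARETTES_PER_PACK * years_smoked


def burden_category(burden):
    """Map a pack-year value to the three categories used in the text."""
    if burden < 20:          # assumption: values between 19 and 20 go here
        return "up to 19 pack-years"
    if burden < 30:
        return "20 to 29 pack-years"
    return "30 pack-years or more"
```

Under these assumptions, five straw cigarettes a day for 40 years count as 15 industrialized cigarettes a day, that is, 30 pack-years, which falls into the highest category.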

Methodological aspects of SIM corrections
Missing data on sex and age were imputed. The most frequent response category in the database (male - 810 cases) was used to impute sex gaps. Regarding age, out of 215,247 LC deaths in individuals over 18 years (2013-2019), 4,719 unfilled cases were imputed using the median age of valid cases.
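A minimal sketch of this imputation rule is shown below, using only the Python standard library. The record layout (dictionaries with `sex` and `age` keys, with `None` marking a gap) is an assumption made for illustration and does not reflect the actual SIM file structure.

```python
from collections import Counter
from statistics import median


def impute_sex_and_age(records):
    """Return new records with gaps filled: modal sex and median age of valid cases."""
    valid_sexes = [r["sex"] for r in records if r["sex"] is not None]
    valid_ages = [r["age"] for r in records if r["age"] is not None]
    modal_sex = Counter(valid_sexes).most_common(1)[0][0]
    median_age = median(valid_ages)
    return [
        {
            **r,
            "sex": r["sex"] if r["sex"] is not None else modal_sex,
            "age": r["age"] if r["age"] is not None else median_age,
        }
        for r in records
    ]
```

The function leaves the input untouched and returns filled copies, so the share of imputed records can still be computed afterwards by comparing the two lists.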
The garbage codes were corrected by checking the codes of interest for LC (ICD C34) and the redistribution percentages proposed by Malta et al. 16. The distribution of garbage codes was recorded to assess their absolute frequency in each subgroup, according to the macro-region of residence and age, for use as the redistribution target. After this survey, n = 4,719 garbage code-type deaths were redistributed.
Deaths coded as ill-defined causes refer to chapter "R" of ICD-10 (R00-R99). The redistribution followed the proportional distribution of causes observed among the ill-defined causes that were investigated and reclassified, using the coefficient proposed by França et al. 15, according to region, age group, and sex. After investigation for ICD-C34, 12,992 ill-defined causes were reassigned.
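The logic of a proportional redistribution within strata can be sketched as below. This is only an illustration of the general idea: the share of ill-defined deaths attributed to LC in each stratum (`lc_share`) stands in for the published coefficients, which are not reproduced here, and all names and the data layout are hypothetical.

```python
def redistribute_ill_defined(lc_deaths, ill_defined_deaths, lc_share):
    """Add to LC the fraction of ill-defined deaths attributed to it in one stratum."""
    if not 0 <= lc_share <= 1:
        raise ValueError("lc_share must be a proportion between 0 and 1")
    return lc_deaths + ill_defined_deaths * lc_share


def corrected_lc_deaths(strata):
    """Apply the redistribution to every (region, age group, sex) stratum.

    strata maps a stratum key to a dict with the keys
    'lc', 'ill_defined', and 'lc_share' (assumed layout).
    """
    return {
        key: redistribute_ill_defined(s["lc"], s["ill_defined"], s["lc_share"])
        for key, s in strata.items()
    }
```

Keeping the calculation per stratum means that a region or age group with many ill-defined deaths receives a proportionally larger correction than one with few.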
The last correction step in SIM was implementing the correction factor for under-registration of deaths, according to the methodology proposed by the Rede Interagencial de Informações para a Saúde (RIPSA - Interagency Health Information Network) 25. This correction amounted to 10,176 LC deaths relative to those reported initially (n = 203,986).
After carrying out the procedures mentioned above, the SIM correction percentage was 0.03% for the sex variable, 2% for age, 2% for garbage codes, 6% for ill-defined causes, and 4.7% for under-registration.

RESULTS
Table 1 presents the percentage distribution of smokers, ex-smokers, people who have never smoked, and smoking burden according to sociodemographic variables in 2013 and 2019.
There was a reduction in the smoking burden between 2013 and 2019, and women consistently smoked less than men. Male participation among smokers increased, which was already high in 2013 (57.2%), with a significant smoking burden of over 20 pack-years (63.7%). The most significant portion of those who "never smoked" were female in 2013 (62.3%) and 2019 (56.6%). The distribution of ex-smokers by sex is similar in both years. An increase in the percentage of ex-smokers aged 60 and over was identified between 2013 and 2019. Conversely, in 2019, the smoking burden among people in this age group was higher. The internal distribution of smokers and smoking burden according to race/color is similar to the general distribution of the population in both years. Considering race/color, Black or mixed-race individuals have a lower smoking burden when compared with White counterparts, especially in 2019. The high smoking burden among people with less education in both years is also noteworthy.
Regarding education, the high percentage of missing data for this variable in both years (25.5% and 21.4%) stands out, given its importance as a socioeconomic proxy for analyzing disparities in access.
Table 3 shows the level of LC staging in HCR patients in 2013 and 2019. The low percentage of people with initial staging levels (I and II) in 2013 (14.2%) and 2019 (14.5%) stands out. The distribution of sociodemographic characteristics of the population accessing services is similar between the years analyzed. In 2019, the percentages of White people, people with higher education, and people who never smoked were higher in stages I and II (58.2%, 6.8%, and 22.6%, respectively). Regarding treatment, a high percentage of surgery as first treatment was observed in people with stages I and II in 2013 (45.6%) and 2019 (50.2%).
Table 4 shows the percentage distribution by sex, race/color, and education, which changed little between 2013 and 2019, and the mortality rate from LC in Brazil. Concerning age groups, mortality rates increased progressively with age, especially from the age of 60 onwards. Among those aged 49 years or younger in 2019, there was a rate of 0.11, which increased to 6.25 among those aged 60 to 69 years old and 16.65 among those aged 80 or older.
In 2019, there was a rate more than twice as high for White people (3.32) compared to Black people (1.27), mixed-race people (1.31), and Yellow or Indigenous people (1.38). Mortality progressively decreased with increasing education. In 2019, people with no education had a rate of 3.36, while for people with incomplete higher education, it reached 0.44, and for those who completed higher education, 1.00.
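The comparison between groups can be made explicit as rate ratios. The sketch below uses the 2019 rates by race/color quoted above; the helper names are hypothetical, and the rate function assumes the rate is expressed per 10 thousand inhabitants, as in the mortality results.

```python
def rate_per_10k(deaths, population):
    """Mortality rate per 10 thousand inhabitants."""
    return deaths / population * 10_000


def rate_ratios(rates, reference):
    """Ratio of the reference group's rate to every other group's rate."""
    return {
        group: rates[reference] / rate
        for group, rate in rates.items()
        if group != reference
    }


# 2019 LC mortality rates by race/color, as reported in the text.
rates_2019 = {
    "White": 3.32,
    "Black": 1.27,
    "Mixed-race": 1.31,
    "Yellow or Indigenous": 1.38,
}
ratios_vs_white = rate_ratios(rates_2019, "White")
```

With these figures, every ratio exceeds 2 (about 2.6 relative to Black people), which is what "more than twice as high" summarizes.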
Table 5 summarizes the results of the data sources used. The proportion of smokers decreased with advancing age, whereas the mortality rate from LC increased. A higher proportion of smokers was found in groups with less education, and similarly, the mortality rate was also higher in this population. Meanwhile, in the HCR, which includes individuals who have received a diagnosis, those aged 80 years or older represented 6.5%, and concerning deaths (SIM), this percentage was 20.2%.

DISCUSSION
In the general population, this study confirmed the worldwide downward trend in the prevalence of smokers 26, as found in Brazilian capitals by VIGITEL 18. Contrary to what was mentioned in that telephone survey, there was no continuity in reducing the prevalence of male smokers. Furthermore, the reduction only occurs in age groups up to 59 years old, among White and Yellow or Indigenous people. Regarding education, the most significant percentage reduction was observed among individuals with no education, with an increase among those with more education. It is essential to mention the reduction in the percentage of those who have never smoked, which may highlight the challenges in the sustainability of tobacco control policies 27.
Regarding the smoking burden, between 2013 and 2019, a decrease was noted in the proportion of those who smoked more than 20 pack-years. In addition, women were noted to smoke less intensely than men, and individuals with less education have a more significant smoking burden. The international literature supports such findings 26,28.
Behavioral patterns in tobacco consumption also varied by race/color. Black and mixed-race people have a lower smoking burden than White people. Despite this, they have a higher proportion of LC cases diagnosed at an advanced stage and, consequently, with a worse prognosis. This finding is similar to research carried out in the USA 9. There are several hypotheses to explain this difference, including variation in metabolism 9 and overlapping socioeconomic factors, such as a diet with insufficient amounts of fruits and vegetables 9.
As expected, in the HCR, the percentage of individuals with LC who have never smoked is much lower than that found in the general population (PNS), reinforcing that smoking is a critical risk factor for the development of LC 1. Access to diagnosis and treatment of the disease is concentrated in the age groups between 50 and 80 years old. Regarding smoking among those with LC in the HCR, men are predominant. Women are the majority among those who have never smoked. A significant disparity in access to LC diagnosis and treatment was noted: in the general population, people with a history of smoking are predominantly mixed-race (PNS), whereas among those with LC registered in the HCR, the majority are White.
LC is still one of the leading preventable causes of death in the country and worldwide 1 .
When analyzing deaths by SIM, the mortality rate from the disease is considerably higher for White people, in contrast to the evidence, which points to higher mortality rates among Black and mixed-race people as a result of inequality in access to timely diagnosis and treatment 29,30 .
Some factors may explain the discrepancy. The predominance of White people in the HCR likely highlights the difficulty in accessing treatment for Black and mixed-race people.
In SIM, there may be errors in the registration of LC as the underlying cause of death 31. Furthermore, another possible explanation could be the differences in the mortality profile between White and Black people, with the latter being more affected by external and infectious causes, leading to premature death 32. It should also be noted that race/color is recorded in the various information systems either as reported by the individual or as assigned by the health professional. Knowing that there is a collinearity between race/color and education 33, these hypotheses are raised due to the inverse association identified in this research between education and the mortality rate due to LC, since it progressively decreases with increasing years of education. Future studies should analyze this issue to elucidate this apparent discrepancy in Brazilian records.
Some corrections were made to analyze the SIM and HCR data. Despite the small effect, corrections are essential for better completeness of information in the SIM. The correction at the HCR for staging was fundamental, since it avoided 32% of missing values.
This study uses secondary data sources and, therefore, is restricted to the variables existing in the databases and their quality. In the PNS, the calculation of smoking status and smoking burden is based on self-reported information, with possible memory bias, mainly concerning the age at which smoking started. Furthermore, there is also no measurement of the starting date of smoking for occasional smokers. The HCR covers hospital records and is not population-based. Additionally, even though it has smoking information, it does not include the smoking burden, making it impossible to know the smoking intensity for those diagnosed with LC. Another limitation is the differences in measuring race/color between the different sources used.
Notably, this is the first study using triangulation of national data sources for different time points. This allowed us to raise essential hypotheses about access to LC diagnosis and treatment in Brazil. Thus, this article made it possible to analyze the population distribution of smoking between 2013 and 2019. It also highlighted relevant sociodemographic disparities in access to diagnosis, treatment, and mortality due to LC.
Strengthening the Population-Based Cancer Registry (PBCR) 34 is recommended to advance knowledge about LC in the country, expanding its coverage in Brazilian municipalities, as it enables linkage between data from various information systems and, thus, obtaining global, longitudinal, and accurate data, which leads to more specific analyses focusing on vulnerable populations to reduce disparities.
Furthermore, this work highlights the importance of feasibility studies on implementing an LC screening strategy in Brazil, as a high percentage of diagnosis was found at an advanced stage 2,4. Studies indicate that combined prevention and early diagnosis strategies tend to work better in controlling mortality due to LC 35.
Finally, it is highlighted that the approach to LC and its risk factors is multifaceted. It involves strengthening information systems to measure and reduce disparities, intervening for diagnosis and treatment at an early stage, and continually investing in tobacco control policies.

Methodological aspects of estimating smoking and smoking status in the PNS
c. distribution of deaths and the mortality rate of LC - SIM (2013 to 2019).
The following were carried out to improve the quality of the data used: redistribution procedures for missing smoking and race/color data by federative unit (FU) in the HCR; correction of missing data in the staging variable (HCR) by redistribution, according to sex, age group, smoking status, and first treatment; and, in SIM, correction of garbage codes for LC, correction of ill-defined causes, correction of under-registration, and redistribution of missing data by sex, age, and FU. The International Classification of Disease (ICD) garbage codes refer to undefined or incomplete diagnoses that do not accurately indicate the cause of death or hospitalization 21. All analyses were described according to demographic and socioeconomic variables: sex, age group, race/color, and education.

Table 1. Percentage distribution of the population by sociodemographic characteristics, according to smoking status and smoking burden in the National Survey of Health (PNS). Brazil, 2013 and 2019.

Table 2 shows the percentage distribution of smokers, ex-smokers, and people who never smoked among patients registered in the HCR in 2013 and 2019. In the HCR, the percentage of people who never smoked was, on average, 19%, considerably lower than that observed in the general population (68.5% in 2013 and 60.2% in 2019). Entry into the health service starts at 50 years of age, and the percentage of people in the most advanced age group (80 years or older) is low compared with other ages, especially among smokers (3.4% in both years). As in the PNS, in the sex distribution of smokers and ex-smokers, men are predominant, both in 2013 (62% and 65.2%) and in 2019 (58.8% and 65.4%). Women are the majority of those who have never smoked (61.3% in 2013 and 63.2% in 2019). The difference between the distribution by race/color stands out, with a predominance of mixed-race people in the PNS (49% in 2013 and 50.4% in 2019) and most White patients in the HCR (56.5% in 2013 and 55.3% in 2019). Despite the high percentage of mixed-race smokers and ex-smokers in the general population (Table 1), they are not captured in the same way in the HCR, indicating an underrepresentation. On the other hand, White people, even ex-smokers or those who have never smoked, have more access to diagnosis and treatment, representing a larger portion of the HCR than of the population distribution (PNS).

Table 2. Percentage distribution of patients diagnosed with lung cancer in the Hospital-based Cancer Registry (HCR) by sociodemographic characteristics, according to smoking status. Brazil, 2013 and 2019.

Table 3. Percentage distribution of patients diagnosed with lung cancer in the Hospital-based Cancer Registry (HCR) by sociodemographic characteristics, according to initial (I and II) and advanced (III and IV) staging levels. Brazil, 2013 and 2019.
There is a slight reduction in the percentage of deaths in the younger age groups. For people aged 49 years or younger, the percentage goes from 8.4% in 2013 to 5.6% in 2019 and, among those aged 50 to 59 years old, from 17.9% to 14.6%. The mortality rate per 10 thousand inhabitants also varied little during the period, and despite showing a slight reduction between 2013 (1.50) and 2014 (1.46), it rose again, reaching 1.55 in 2019. In every year, a higher rate was noted for men (1.79 in 2019) than for women (1.32 in 2019).

Table 5. Percentage distribution by socioeconomic characteristics of the population (National Survey of Health [PNS]) and lung cancer cases (Hospital-based Cancer Registry [HCR]), according to smoking status, deaths, and mortality rate (Mortality Information System [SIM]). Brazil, 2019.